Abstract. An instance of the Connected Maximum Cut problem consists of an undirected
graph $G = (V, E)$ and the goal is to find a subset of vertices $S \subseteq V$ that
maximizes the number of edges in the cut $\delta(S)$ such that the induced graph $G[S]$ is
connected. We present the first non-trivial approximation algorithm for the
Connected Maximum Cut problem in general graphs using novel techniques. We then
extend our algorithm to the edge weighted case and obtain a poly-logarithmic
approximation algorithm. Interestingly, in contrast to the classical Max-Cut problem that
can be solved in polynomial time on planar graphs, we show that the Connected Maximum
Cut problem remains NP-hard on unweighted, planar graphs. On the positive
side, we obtain a polynomial time approximation scheme for the Connected Maximum
Cut problem on planar graphs and more generally on bounded genus graphs.


1 Introduction 

Submodular optimization problems have, in recent years, received a considerable amount of 
attention [1, 2, 3, 4, 7, 15, 27] in algorithmic research. In a general Submodular Maximization 
problem, we are given a non-negative submodular¹ function over the power set of a universe
$U$ of elements, $f : 2^U \to \mathbb{R}^+ \cup \{0\}$, and the goal is to find a subset $S \subseteq U$ that maximizes $f(S)$
so that $S$ satisfies certain pre-specified constraints. In addition to their practical relevance,
the study of submodular maximization problems has led to the development of several 
important theoretical techniques such as the continuous greedy method and multi-linear 
extensions [4] and the double greedy [2] algorithm, among others. 

In this study, we are interested in the problem of maximizing a submodular set function
over vertices of a graph, such that the selected vertices induce a connected subgraph.
Motivated by applications in coverage over wireless networks, Kuo et al. [25] consider the
problem of maximizing a monotone, submodular function $f$ subject to connectivity and
cardinality constraints of the form $|S| \le k$ and provide an approximation algorithm. For
a restricted class of monotone, submodular functions that includes the covering function²,
Khuller et al. [23] give a constant factor approximation to the problem of maximizing $f$
subject to connectivity and cardinality constraints.

¹ A function $f$ is called submodular if $f(S) + f(T) \ge f(S \cup T) + f(S \cap T)$ for all $S, T \subseteq U$.

In the light of these results, it is rather surprising that no non-trivial approximation
algorithms are known for the case of general (non-monotone) submodular functions.
Formally, we are interested in the following problem, which we refer to as Connected Submodular
Maximization (CSM): Given a simple, undirected graph $G = (V, E)$ and a non-negative
submodular set function $f : 2^V \to \mathbb{R}^+ \cup \{0\}$, find a subset of vertices $S \subseteq V$ that maximizes
$f(S)$ such that $G[S]$ is connected. We take the first but important step in this direction
and study the problem in the case of one of the most important non-monotone submodular
functions, namely the Cut function. Formally, given an undirected graph $G = (V, E)$, the
goal is to find a subset $S \subseteq V$, such that $G[S]$ is connected and the number of edges that
have exactly one end point in $S$, referred to as the cut function $\delta(S)$, is maximized. We
refer to this as the Connected Maximum Cut problem. Further, we also consider an edge
weighted variant of this problem, called the Weighted Connected Maximum Cut problem,
where the function to be maximized is the total weight of edges in the cut $\delta(S)$.

We now outline an application to the image segmentation problem that seeks to identify 
“objects” in an image. Graph based approaches for image segmentation [16, 28] represent 
each pixel as a vertex and weighted edges represent the dissimilarity (or similarity depending 
on the application) between adjacent pixels. Given such a graph, a connected set of pixels 
with a large weighted cut naturally corresponds to an object in the image. Vicente et al. [33] 
show that even for interactive image segmentation, techniques that require connectivity also
perform significantly better than cut based methods alone.


Related Work 

Max-Cut is a fundamental problem in combinatorial optimization that finds applications in 
diverse areas. A simple randomized algorithm that adds each vertex to S independently 
with probability 1/2 gives a 0.5-approximate solution in expectation. In a breakthrough
result, Goemans and Williamson [18] gave a 0.878-approximation algorithm using semidefinite
programming and randomized rounding. Further, Khot et al. [22] showed that this factor is
optimal assuming the Unique Games Conjecture. Interestingly, the Max-Cut problem can be
optimally solved in polynomial time in planar graphs by a curious connection to the
matching problem in the dual graph [20]. To the best of our knowledge, the Connected Maximum
Cut problem has not been considered before our work. Haglin and Venkatesan [21] showed
that a related problem, where we require both sides of the cut, namely $S$ and $V \setminus S$, to be
connected, is NP-hard in planar graphs.
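The 0.5 guarantee of the random cut mentioned above comes from the fact that every edge is cut with probability exactly 1/2. The following minimal Python sketch checks this on a tiny graph by averaging the cut size over all $2^n$ subsets; the function names and the example graph are illustrative assumptions, not part of the original text.

```python
from itertools import product


def cut_size(edges, S):
    """Number of edges with exactly one endpoint in S."""
    S = set(S)
    return sum(1 for u, v in edges if (u in S) != (v in S))


def expected_random_cut(vertices, edges):
    """Exact expected cut size when each vertex joins S with probability 1/2.

    Averages over all 2^n subsets, so this is only meant for tiny graphs.
    """
    vertices = list(vertices)
    total = 0
    for mask in product([False, True], repeat=len(vertices)):
        S = {v for v, chosen in zip(vertices, mask) if chosen}
        total += cut_size(edges, S)
    return total / 2 ** len(vertices)


# Illustrative graph: a 4-cycle with one chord (5 edges).
example_edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
```

Since the optimum cut has at most $|E|$ edges, an expected value of $|E|/2$ gives the stated ratio. Note that a random set $S$ is generally not connected, which is why this argument does not carry over to the connected variant.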

We note that the well studied Maximum Leaf Spanning Tree (MLST) problem (e.g. see 
[31]) is a special case of the Connected Submodular Maximization problem. We also note 
that recent work on graph connectivity under vertex sampling leads to a simple constant
approximation to the Connected Submodular Maximization for highly connected graphs, i.e.,
for graphs with $\Omega(\log n)$ vertex connectivity. Proofs of these claims are presented in
Appendices B and C respectively.

² In this context, a covering function is defined as $f(S) = \sum_{v \in N^+(S)} \mathrm{weight}(v)$, where $N^+(S)$ is the
closed neighborhood of the set of vertices $S$.



We conclude this section by noting that connected variants of many classical combinatorial
problems have been extensively studied in the literature and have been found to be
useful. The best example of this is the Connected Dominating Set problem. Following the
seminal work of Guha and Khuller [19], the problem has found extensive applications (with
more than a thousand citations) in the domain of wireless ad hoc networks as a virtual
backbone (e.g. see [9, 12]). A few other examples of connected variants of classic optimization
problems include Group Steiner Tree [17] (which can be seen as a generalization of a connected
variant of Set Cover), Connected Domatic Partition [5, 6], Connected Facility Location [13, 32],
and Connected Vertex Cover [8].


Contribution and Techniques 

Our key results can be summarized as follows. 

1. We obtain the first $\Omega(1/\log n)$ approximation algorithm for the Connected Maximum
Cut (CMC) problem in general graphs. Often, for basic connectivity problems on graphs,
one can obtain simple $O(\log n)$ approximation algorithms using a probabilistic embedding
into trees with $O(\log n)$ stretch [14]. Similarly, using the cut-based decompositions given
by Racke [29], one can obtain $O(\log n)$ approximation algorithms for cut problems (e.g.
Minimum Bisection). Interestingly, since the CMC problem has the flavors of both cut and
connectivity problems simultaneously, neither of these approaches is applicable. Our novel
approach is to look for $\alpha$-thick trees, which are basically sub-trees with “high” degree sum
on the leaves.

2. For the Weighted Connected Maximum Cut problem, we obtain an $\Omega(1/\log^2 n)$
approximation algorithm. The basic idea is to group the edges into a logarithmic number of weight
classes and show that the problem on each weight class boils down to the special case where
the weight of every edge is either 0 or 1.

3. We obtain a polynomial time approximation scheme for the CMC problem in planar
graphs and more generally in bounded genus graphs. This requires the application of a
stronger form of the edge contraction theorem by Demaine, Hajiaghayi and Kawarabayashi [11]
that may be of independent interest.

4. We show that the CMC problem remains NP-hard even on unweighted, planar graphs. 
This is in stark contrast with the regular Max-Cut problem that can be solved optimally in 
planar graphs in polynomial time. We obtain a polynomial time reduction from a special 
case of 3-SAT called the Planar Monotone 3-SAT (PM-3SAT), to the CMC problem in planar 
graphs. This entails a delicate construction, exploiting the so-called “rectilinear
representation” of a PM-3SAT instance, to maintain planarity of the resulting CMC instance.

2 Approximation Algorithms for General Graphs 

In this section, we consider the Connected Maximum Cut problem in general graphs. In fact,
we provide an $\Omega(1/\log n)$ approximation algorithm for the more general problem in which
edges can have weight 0 or 1 and the objective is to maximize the number of edges of
weight 1 in the cut. This generalization will be useful later in obtaining a poly-logarithmic
approximation algorithm for arbitrary weighted graphs.

We denote the cut of a subset of vertices $S$ in a graph $G$, i.e., the set of edges in $G$
that are incident on exactly one vertex of $S$, by $\delta_G(S)$ or, when $G$ is clear from context, just
$\delta(S)$. Further, for two disjoint subsets of vertices $S_1$ and $S_2$ in $G$, we denote the set of edges
that have one end point in each of $S_1$ and $S_2$ by $\delta_G(S_1, S_2)$ or simply $\delta(S_1, S_2)$. The formal
problem definition follows.

Problem Definition. {0,1}-Connected Maximum Cut (b-CMC): Given a graph $G = (V, E)$
and a weight function $w : E \to \{0, 1\}$, find a set $S \subseteq V$ that maximizes $\sum_{e \in \delta(S)} w(e)$ such
that $G[S]$ is a connected subgraph.
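To make the definition concrete, here is a minimal brute-force sketch in Python that enumerates all connected vertex subsets of a tiny instance and returns one with the largest weighted cut. It is exponential in the number of vertices and is only meant to illustrate the objective and the connectivity constraint; the function names and the edge representation `(u, v, w)` are assumptions made for this example.

```python
from itertools import combinations


def is_connected(S, adj):
    """True if the subgraph induced by the non-empty vertex set S is connected."""
    S = set(S)
    if not S:
        return False
    start = next(iter(S))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in S and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == S


def cut_weight(S, weighted_edges):
    """Total weight of edges with exactly one endpoint in S."""
    S = set(S)
    return sum(w for u, v, w in weighted_edges if (u in S) != (v in S))


def brute_force_bcmc(vertices, weighted_edges):
    """Return (S, value) for a best connected set S; weights are 0 or 1."""
    vertices = list(vertices)
    adj = {v: set() for v in vertices}
    for u, v, _ in weighted_edges:
        adj[u].add(v)
        adj[v].add(u)
    best_S, best_val = None, -1
    for r in range(1, len(vertices) + 1):
        for S in combinations(vertices, r):
            if is_connected(S, adj):
                val = cut_weight(S, weighted_edges)
                if val > best_val:
                    best_S, best_val = set(S), val
    return best_S, best_val
```

On the path 0-1-2-3 with all weights 1, the unconstrained maximum cut is 3 (take $S = \{0, 2\}$), but the best connected set only achieves 2, which shows how the connectivity requirement changes the problem.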

We call an edge of weight 0 a 0-edge and an edge of weight 1 a 1-edge. Further, let $w(\delta(S)) =
\sum_{e \in \delta(S)} w(e)$ denote the weight of the cut, i.e., the number of 1-edges in the cut. We first
start with a simple reduction rule that ensures that every vertex $v \in V$ has at least one
1-edge incident on it.

Claim 1 Given a graph $G = (V, E)$, we can construct a graph $G' = (V', E')$ in polynomial
time, such that every $v' \in V'$ has at least one 1-edge incident on it and $G'$ has a b-CMC
solution $S'$ of weight at least $\psi$ if and only if $G$ has a b-CMC solution $S$ of weight at least
$\psi$.

Proof. Let $v \in V$ be a vertex in $G$ that has only 0-edges incident on it and let $\{v_1, v_2, \ldots, v_l\}$
denote the set of its neighbors. Consider the graph $G'$ obtained from $G$ by deleting $v$ along
with all its incident edges and adding 0-edges between every pair of its neighbors $\{v_i, v_j\}$
such that $\{v_i, v_j\} \notin E$. Let $S$ denote a feasible solution of weight $\psi$ in $G$. If $v \notin S$, then
clearly $S' = S$ is the required solution in $G'$. If $v \in S$, we set $S' = S \setminus \{v\}$ and we claim that
$G'[S']$ is connected if $G[S]$ is connected and $\sum_{e \in \delta_{G'}(S')} w(e) = \sum_{e \in \delta_G(S)} w(e)$. The latter
part of the claim is true since all the edges that we delete and add are 0-edges. To prove
the former part, notice that if $v$ is not a cut vertex in $G[S]$ then $G'[S']$ must be connected.
On the other hand, even if $v$ is a cut vertex, the new edges added among all pairs of $v$'s
neighbors ensure that $G'[S']$ is connected. Finally, to prove the other direction, suppose we
have a feasible solution $S'$ of weight $\psi$ in $G'$. Now, if $G[S']$ is connected, then $S = S'$ is
a feasible solution in $G$ of weight $\psi$. Otherwise, set $S = S' \cup \{v\}$. Since $v$ creates a path
between all pairs of its neighbors, $G[S]$ is connected if $G'[S']$ is connected and is thus a
feasible solution of the same weight. The proof of the claim follows by induction. □
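The reduction in the proof of Claim 1 can be sketched as follows. This is a minimal Python illustration under stated assumptions: the graph is given as a dictionary mapping `frozenset({u, v})` to a weight in {0, 1}, vertex labels are sortable, and the function name is hypothetical. Vertices without any incident edge are left untouched, since the proof only treats vertices that have neighbors.

```python
from itertools import combinations


def eliminate_zero_vertices(vertices, weight):
    """Repeatedly delete a vertex whose incident edges are all 0-edges.

    After deleting such a vertex, 0-edges are added between every pair of its
    former neighbors that are not already adjacent, as in the proof of Claim 1.
    Returns the new vertex set and the new edge-weight dictionary.
    """
    vertices = set(vertices)
    weight = dict(weight)
    changed = True
    while changed:
        changed = False
        for v in sorted(vertices):
            incident = [e for e in weight if v in e]
            if incident and all(weight[e] == 0 for e in incident):
                neighbors = sorted(u for e in incident for u in e if u != v)
                for e in incident:
                    del weight[e]
                vertices.remove(v)
                for a, b in combinations(neighbors, 2):
                    # setdefault keeps the weight of an edge that already exists
                    weight.setdefault(frozenset((a, b)), 0)
                changed = True
                break
    return vertices, weight
```

Each round removes one vertex, so the loop runs at most $|V|$ times, in line with the polynomial running time stated in the claim.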

From now on, we will assume, without loss of generality, that every vertex of $G$ has at
least one 1-edge incident on it. We now introduce some new definitions that will help
us to present the main algorithmic ideas. We denote by $W_G(v)$ the total weight of edges
incident on a vertex $v$ in $G$, i.e., $W_G(v) = \sum_{e : v \in e} w(e)$. In other words, $W_G(v)$ is the total
number of 1-edges incident on $v$. Further, let $\eta$ be the total number of 1-edges in the graph.
The following notion of an $\alpha$-thick tree is a crucial component of our algorithm.

Definition 1 ($\alpha$-Thick Tree). Let $G = (V, E)$ be a graph with $n$ vertices and $\eta$ 1-edges.
A subtree $T \subseteq G$ (not necessarily spanning), with leaf set $L$, is said to be $\alpha$-thick if
$\sum_{v \in L} W_G(v) \ge \alpha \eta$.

The following lemma shows that this notion of an $\alpha$-thick tree is intimately connected
with the b-CMC problem.

Lemma 1 For any $\alpha > 0$, given a polynomial time algorithm $\mathcal{A}$ that computes an $\alpha$-thick
tree $T$ of a graph $G$, we can obtain an $\frac{\alpha}{4}$-approximation algorithm for the b-CMC problem
on $G$.



Proof. Given a graph $G = (V, E)$ and weight function $w : E \to \{0, 1\}$, we use Algorithm $\mathcal{A}$
to compute an $\alpha$-thick tree $T$, with leaf set $L$. Let $m_L$ denote the number of 1-edges in $G[L]$,
the subgraph induced by $L$ in the graph $G$. We now partition $L$ into two disjoint sets $L_1$
and $L_2$ such that the number of 1-edges in $\delta(L_1, L_2)$ is at least $\frac{m_L}{2}$. This can be done by applying
the standard randomized algorithm for Max-Cut (e.g. see [26]) on $G[L]$ after deleting all
the 0-edges. Now, consider the two connected subgraphs $T \setminus L_1$ and $T \setminus L_2$. We first claim
that every 1-edge in $\delta(L)$ belongs to either $\delta(T \setminus L_1)$ or $\delta(T \setminus L_2)$. Indeed, any 1-edge $e$ in
$\delta(L)$ belongs to one of the four possible sets, namely $\delta(L_2, T \setminus L)$, $\delta(L_1, V \setminus T)$, $\delta(L_1, T \setminus L)$
and $\delta(L_2, V \setminus T)$. In the first two cases, $e$ belongs to $\delta(T \setminus L_2)$ while in the last two cases,
$e$ belongs to $\delta(T \setminus L_1)$, hence the claim. Further, every 1-edge in $\delta(L_1, L_2)$ belongs to both
$\delta(T \setminus L_1)$ and $\delta(T \setminus L_2)$. Hence, we have

\begin{align}
\sum_{e \in \delta(T \setminus L_1)} w(e) + \sum_{e \in \delta(T \setminus L_2)} w(e) &\ge \sum_{e \in \delta(L)} w(e) + 2 \sum_{e \in \delta(L_1, L_2)} w(e) \tag{1} \\
&\ge \sum_{e \in \delta(L)} w(e) + m_L \ge \frac{1}{2} \sum_{v \in L} W_G(v) \ge \frac{\alpha \eta}{2} \tag{2}
\end{align}

Hence, the better of the two solutions $T \setminus L_1$ or $T \setminus L_2$ is guaranteed to have a cut of weight
at least $\frac{\alpha \eta}{4}$, where $\eta$ is the total number of 1-edges in $G$. To complete the proof we note
that for any optimal solution $OPT$, $w(\delta(OPT)) \le \eta$. □
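The construction in this proof can be sketched in Python as follows. One deliberate substitution: instead of the randomized Max-Cut step, the sketch splits the leaves with a simple greedy rule (each leaf goes to the side opposite to the majority of its already placed leaf neighbors), which also cuts at least half of the 1-edges inside $G[L]$. The function names and the representation of the 1-edges as a list of pairs are assumptions made for illustration.

```python
def cut_size(S, one_edges):
    """Number of 1-edges with exactly one endpoint in S."""
    S = set(S)
    return sum(1 for a, b in one_edges if (a in S) != (b in S))


def greedy_split(leaves, one_edges):
    """Split the leaves into (L1, L2) cutting at least half the 1-edges in G[L]."""
    side = {}
    for v in leaves:
        count = {1: 0, 2: 0}
        for a, b in one_edges:
            other = b if a == v else a if b == v else None
            if other in side:  # only leaves that were already placed count
                count[side[other]] += 1
        side[v] = 2 if count[1] >= count[2] else 1
    L1 = {v for v in leaves if side[v] == 1}
    L2 = {v for v in leaves if side[v] == 2}
    return L1, L2


def solution_from_thick_tree(tree_vertices, leaves, one_edges):
    """Return the better of T minus L1 and T minus L2, as in the proof of Lemma 1."""
    L1, L2 = greedy_split(leaves, one_edges)
    candidates = (set(tree_vertices) - L1, set(tree_vertices) - L2)
    return max(candidates, key=lambda S: cut_size(S, one_edges))
```

For example, on a star with center 0, leaves 1 to 4, and extra 1-edges (1, 2) and (3, 4), the leaf degree sum is 8, and the returned set has a cut of 4, which respects the bound of one quarter of the leaf degree sum.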

Thus, if we have an algorithm to compute $\alpha$-thick trees, Lemma 1 provides an $\Omega(\alpha)$-
approximation algorithm for the b-CMC problem. Unfortunately, there exist graphs that do
not contain $\alpha$-thick trees for any non-trivial value of $\alpha$. For example, let $G$ be a path graph
with $n$ vertices and $m = n - 1$ 1-edges. It is easy to see that for any subtree $T$, the sum of
degrees of the leaves is at most 4. In spite of this setback, we show that the notion of $\alpha$-thick
trees is still useful in obtaining a good approximation algorithm for the b-CMC problem.
In particular, Lemma 3 and Theorem 1 show that the path graph is the only bad case, i.e., if
the graph $G$ does not have a long induced path, then one can find an $\Omega(1/\log n)$-thick tree.
Lemma 2 shows that we can assume without loss of generality that the b-CMC instance does
not have such a long induced path.




Shrinking Thin Paths. A natural idea to handle the above “bad” case is to get rid of 
such long paths that contain only vertices of degree two by contracting the edges. We refer 
to a path that only contains vertices of degree two as a d-2 path. Further, we define the 
length of a d-2 path as the number of vertices (of degree two) that it contains. The following 
lemma shows that we can assume without loss of generality that the graph G contains no 
“long” d-2 paths. 

Lemma 2 Given a graph $G$, we can construct, in polynomial time, a graph $G'$ with no d-2
paths of length $\ge 3$ such that $G'$ has a b-CMC solution $S'$ of cut weight $w(\delta(S'))$ at least $\psi$
if and only if $G$ has a b-CMC solution $S$ of cut weight at least $\psi$. Further, given the solution
$S'$ of $G'$, we can recover $S$ in polynomial time.

Proof. We may assume that $G$ is connected, because otherwise we can handle each component
separately. We further assume that $G$ is not a simple cycle, otherwise it is trivial
to solve such an instance. If $G$ does not have a d-2 path of length $\ge 3$, then trivially we
have $G' = G$. Otherwise, let $p = [v_0, e_0, v_1, e_1, v_2, e_2, v_3]$ be a path in $G$ such that $v_1$, $v_2$
and $v_3$ have degree two and $\deg(v_0) \ne 2$. Note that such a path $p$ must exist as $G$ is not a
simple cycle. We now perform the following operation on $G$ to obtain a new graph $G_{new}$.
Delete the elements $\{e_0, v_1, e_1, v_2, e_2\}$. Add a new vertex $v_{new}$ and edges $e'_0 = (v_0, v_{new})$
and $e'_1 = (v_{new}, v_3)$. Since $\deg(v_0) \ne 2$ and $\deg(v_3) = 2$, we are guaranteed that $v_0 \ne v_3$ and
hence we do not introduce any multi-edges. The weights on the new edges are determined
as follows. Let $n_p$ denote the number of 1-edges in $E_p = \{e_0, e_1, e_2\}$. If $n_p \ge 2$, we set
$w(e'_0) = w(e'_1) = 1$. If $n_p = 1$, then we set $w(e'_0) = 0$ and $w(e'_1) = 1$. Otherwise, we set
$w(e'_0) = w(e'_1) = 0$. We claim that $G_{new}$ has a b-CMC solution $S'$ of cut weight at least $\psi$
if and only if $G$ has a solution $S$ of cut weight at least $\psi$.

Let us first assume that there is a set $S$ in $G$ that is a solution to the b-CMC problem
with cut weight $\psi$. We now show that there exists an $S'$ in $G_{new}$ that is a solution to the
b-CMC problem with cut weight at least $\psi$. The proof in this direction is done for three
possible cases, based on the cardinality of $\delta(S) \cap E_p$. We note that $|\delta(S) \cap E_p| \le 2$, since
$G[S]$ must be connected.

Case 1. $|\delta_G(S) \cap E_p| = 2$. Note that since $S$ is connected, we must have either (i)
$S \subseteq \{v_1, v_2\}$ or (ii) $\{v_0, v_3\} \subseteq S$. In the former case, we set $S' = \{v_{new}\}$ and the claim follows
by the definition of $w(e'_0)$ and $w(e'_1)$. In the latter case, we set $S' = S \setminus \{v_1, v_2\}$. Since $v_1$
and $v_2$ are vertices of degree two, $G_{new}[S']$ is connected. Further, every edge $e \in \delta_G(S) \setminus E_p$
also belongs to $\delta_{G_{new}}(S')$. The claim follows once we observe that both $e'_0$ and $e'_1$ are in
$\delta_{G_{new}}(S')$.

Case 2. $|\delta_G(S) \cap E_p| = 1$. In this case, we must have either $v_0 \in S$ or $v_3 \in S$ but not
both. Let us first assume $v_0 \in S$. We set $S' = (S \cup \{v_{new}\}) \setminus \{v_1, v_2\}$. It is clear that if $G[S]$
is connected, so is $G_{new}[S']$. Due to the removal of $v_1$ and $v_2$, we have $\delta_G(S) \setminus \delta_{G_{new}}(S') =
\{e_i\}$ for some edge $e_i \in E_p$. On the other hand, due to the addition of $v_{new}$, we have
$\delta_{G_{new}}(S') \setminus \delta_G(S) = \{e'_1\}$ and the claim follows since $w(e'_1) \ge w(e_i)$ for any $e_i \in E_p$. Now
assume that $v_3 \in S$. In this case, we set $S' = S \setminus \{v_1, v_2\}$. Since $v_{new} \notin S'$, we again have
$e'_1 \in \delta_{G_{new}}(S')$ and the proof follows as above.

Case 3. $|\delta_G(S) \cap E_p| = 0$. In this case, one of the following holds: either (i) $\{v_0, v_1, v_2, v_3\} \subseteq
S$ or (ii) $\{v_0, v_1, v_2, v_3\} \cap S = \emptyset$. If the latter is true, the proof is trivial by setting $S' = S$.
In the former case, we set $S' = (S \setminus \{v_1, v_2\}) \cup \{v_{new}\}$. The addition of $v_{new}$ maintains
connectivity between $v_0$ and $v_3$ and hence since $S$ is connected, so is $S'$. Further, we have
$\delta_G(S) = \delta_{G_{new}}(S')$ since no edge in $\delta_G(S)$ is incident on $v_1$ or $v_2$.

In order to prove the other direction, we assume that $S'$ is a solution to the b-CMC
problem on $G_{new}$ with a cut weight of $\psi$. We now construct a set $S$ that is a solution to
b-CMC on $G$ of weight at least $\psi$. The proof proceeds in three cases similarly.

Case 1. Both $e'_0 \in \delta_{G_{new}}(S')$ and $e'_1 \in \delta_{G_{new}}(S')$. One of the following holds: (i)
$S' = \{v_{new}\}$ or (ii) $\{v_0, v_3\} \subseteq S'$. In the former case, let $S$ be the subset of $\{v_1, v_2\}$ having
the largest weight cut. By construction, we have that the weight of the cut $\delta(S)$ is at least the
sum of weights of $e'_0$ and $e'_1$. For the latter, let $S$ be the best among $S'$, $S' \cup \{v_1\}$, and
$S' \cup \{v_2\}$ and the proof follows as above.

Case 2. Either $e'_0 \in \delta_{G_{new}}(S')$ or $e'_1 \in \delta_{G_{new}}(S')$ but not both. Let $e_{max}$ be the edge of
maximum weight in $E_p$. The edge $e_{max}$ splits the path $p$ into two connected components,
one containing $v_0$, call it $p_0$, and the other containing $v_3$, call it $p_3$. Now to construct $S$, we
delete $v_{new}$ from $S'$ (if it contains it) and add the component $p_0$ if $v_0 \in S'$ or the component
$p_3$ if $v_3 \in S'$. Again connectivity is clearly preserved. We now argue that the cut weight is
also preserved. Indeed, this is true since we have that $w(e_{max}) \ge \max(w(e'_0), w(e'_1))$ and the
rest of the cut edges in $S'$ remain as they are in $S$.

Case 3. None of $e'_0$, $e'_1$ belong to $\delta_{G_{new}}(S')$. In this case, if $v_{new} \notin S'$, then trivially $S = S'$
works. Otherwise, we set $S = (S' \setminus \{v_{new}\}) \cup \{v_1, v_2\}$. It is easy to observe that both connectivity and
all the cut edges are preserved in this case.

Now, to construct G', we repeatedly apply the above contraction as long as possible. 
This will clearly take polynomial time as in each iteration, we reduce the number of degree- 
2 vertices by 1. Hence we have the claim. □ 
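A single contraction step from the proof of Lemma 2 can be sketched in Python as below. The sketch assumes the graph is a dictionary mapping `frozenset({u, v})` to a weight in {0, 1}; the function name and the label given to the new vertex are hypothetical. It looks for a path $v_0, v_1, v_2, v_3$ with $\deg(v_0) \ne 2$ and $v_1, v_2, v_3$ of degree two, and applies the weight rule based on the number $n_p$ of 1-edges on the path.

```python
def contract_once(weight):
    """Apply one contraction step; return the new weights, or None if none applies."""
    adj = {}
    for e in weight:
        u, v = tuple(e)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    for v0 in adj:
        if len(adj[v0]) == 2:
            continue  # the path must start next to a vertex of degree != 2
        for v1 in adj[v0]:
            if len(adj[v1]) != 2:
                continue
            (v2,) = adj[v1] - {v0}
            if len(adj[v2]) != 2:
                continue
            (v3,) = adj[v2] - {v1}
            if len(adj[v3]) != 2:
                continue
            old = [frozenset((v0, v1)), frozenset((v1, v2)), frozenset((v2, v3))]
            n_p = sum(weight[e] for e in old)  # number of 1-edges on the path
            new_weight = {e: w for e, w in weight.items() if e not in old}
            v_new = ("new", v1, v2)  # hypothetical label for the merged vertex
            new_weight[frozenset((v0, v_new))] = 1 if n_p >= 2 else 0
            new_weight[frozenset((v_new, v3))] = 1 if n_p >= 1 else 0
            return new_weight
    return None
```

Calling `contract_once` until it returns `None` corresponds to applying the contraction as long as possible. On a path with five vertices and all weights 1, one step yields a path with four vertices, after which no further step applies.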


Spanning Tree with Many Leaves. Assuming that the graph has no long d-2 paths, the
following lemma shows that we can find a spanning tree $T$ that has $\Omega(n)$ leaves. Note that
Claim 1 now guarantees that there are $\Omega(n)$ 1-edges incident on the leaves of $T$.

Lemma 3 Given a graph $G = (V, E)$ with no d-2 paths of length $\ge 3$, we can obtain, in
polynomial time, a spanning tree $T = (V, E_T)$ with at least $\frac{n}{14}$ leaves.

Proof. Let $T$ be any spanning tree of $G$. We note that although $G$ does not have d-2 paths of
length $\ge 3$, such a guarantee does not hold for paths in $T$. Suppose that there is a d-2 path
$p$ of length 7 in $T$. Let the vertices of this path be numbered $v_1, v_2, \ldots, v_7$ and consider
the vertices $v_3, v_4, v_5$. Since $G$ does not have any d-2 path of length 3, there is a vertex
$v_i$, $i \in \{3, 4, 5\}$, such that $\deg_G(v_i) \ge 3$. We now add an edge $e = \{v_i, w\}$ in $G \setminus T$ to the tree
$T$. The cycle $C$ that is created as a result must contain either the edge $\{v_1, v_2\}$ or the edge
$\{v_6, v_7\}$. We delete this edge to obtain a new spanning tree $T'$. It is easy to observe that
the number of vertices of degree two in $T'$ is strictly less than that in $T$. This is because,
although the new edge $\{v_i, w\}$ can cause $w$ to have degree two in $T'$, we are guaranteed that
the vertex $v_i$ will have degree three and vertices $v_1$ and $v_2$ (or $v_6$ and $v_7$) will have degree
one. Hence, as long as there are d-2 paths of length 7 in $T$, the number of vertices of degree
two can be strictly decreased. Thus this process must terminate in at most $n$ steps and the
final tree obtained does not have any d-2 paths of length $\ge 7$.

We now show that the final tree $T$ contains $\Omega(n)$ leaves by a simple charging argument.
Let the tree be rooted at an arbitrary vertex. We assign each vertex of $T$ a token and
redistribute them in the following way: Every vertex $v$ of degree two in $T$ gives its token
to its first non degree two descendant, breaking ties arbitrarily. Since there is no d-2 path
of length $\ge 7$, each non degree two vertex collects at most 7 tokens. Hence, the number of
vertices not having degree two in $T$ is at least $\frac{n}{7}$. Further, since the average degree of all
vertices in a tree is at most 2, a simple averaging argument shows that $T$ must contain
at least $\frac{n}{14}$ vertices of degree one, i.e., $\frac{n}{14}$ leaves. □
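The averaging step at the end of this proof can be spelled out as follows. This is a sketch under the assumption that the token argument yields at least $n/7$ vertices whose degree is not two; the symbols $n_1$, $n_2$ and $n_{\ge 3}$ (the numbers of tree vertices of degree one, two, and at least three) are introduced here only for illustration.

```latex
\begin{align*}
n_1 + n_2 + n_{\ge 3} &= n, \\
n_1 + 2 n_2 + 3 n_{\ge 3} &\le \sum_{v} \deg_T(v) = 2(n - 1), \\
\text{so}\quad n_{\ge 3} - n_1 &\le -2, \quad\text{i.e.}\quad n_1 \ge n_{\ge 3} + 2, \\
n_1 + n_{\ge 3} \ge \frac{n}{7} \;&\Longrightarrow\; 2 n_1 \ge \frac{n}{7} + 2 \;\Longrightarrow\; n_1 \ge \frac{n}{14} + 1 .
\end{align*}
```

The third line is obtained by subtracting twice the first line from the second, so the number of leaves always exceeds the number of vertices of degree at least three.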

Obtaining an $\Omega(1/\log n)$ Approximation

We now have all the ingredients required to obtain the $\Omega(1/\log n)$ approximation algorithm.
We observe that if the graph $G$ is sparse, i.e. $\eta \le cn \log n$ (for a suitable constant $c$), then the
tree obtained by using Lemma 3 is an $\Omega(1/\log n)$-thick tree and thus we obtain the required
approximate solution in this case. On the other hand, if the graph $G$ is dense, then we
use Lemma 3 to obtain a spanning tree, delete the leaves of this tree, and then repeat
this procedure until we have no more vertices left. Since we delete a constant fraction of
vertices in each iteration, the total number of iterations is $O(\log n)$. We then choose the
“best” tree out of the $O(\log n)$ trees so obtained and show that it must be an $\alpha$-thick tree,
with $\alpha = \Omega(1/\log n)$. Finally, using Lemma 1, we obtain an approximate solution as
desired. We refer to Algorithm 1 for the detailed algorithm.



 1  Input: Graph $G = (V, E)$
 2  Output: A subset $S \subseteq V$, such that $G[S]$ is connected
 3  Set $G_1(V_1, E_1) = G$, $n_1 = |V_1|$
 4  Let $\eta \leftarrow$ Number of 1-edges in $G$
 5  Use Lemma 3 to obtain a spanning tree $T_1$ of $G_1$ with leaf set $L_1$
 6  if $\eta \le cn \log n$ then
 7      Use Lemma 1 on $T_1$ to obtain a connected set $S$
 8      return $S$
 9  end
10  $i = 1$
11  while $G_i \ne \emptyset$ do
12      $E_{i+1} \leftarrow E_i \setminus (E[L_i] \cup \delta(L_i))$
13      $V_{i+1} \leftarrow V_i \setminus L_i$, $n_{i+1} = |V_{i+1}|$
14      Contract degree-2 vertices in $G_{i+1}$
15      Use Lemma 3 to obtain a spanning tree $T_{i+1}$ of $G_{i+1}$ with leaf set $L_{i+1}$
16      $i = i + 1$
17  end
18  Choose $j = \arg\max_i \left( \sum_{v \in L_i} \deg_G(v) \right)$
19  Use Lemma 1 on $T_j$ to obtain a connected set $S$
20  return $S$

Algorithm 1: Finding $\alpha$-thick trees


Theorem 1. Algorithm 1 gives an $\Omega(1/\log n)$ approximate solution for the b-CMC problem.


Proof. Let us assume that $\eta \le cn \log n$ (for some constant $c$). Now, Lemma 3 and Claim 1
together imply that $\sum_{v \in L} W_G(v) = \Omega(n)$. Further, since we have $w(\delta(OPT)) \le \eta \le cn \log n$,
$T$ is an $\alpha$-thick tree for some $\alpha = \Omega(1/\log n)$. Hence, we obtain an $\Omega(1/\log n)$ approximate
solution using Lemma 1.

On the other hand, if $\eta > cn \log n$, we show that at least one of the trees obtained
by the repeated applications of Lemma 3 is an $\alpha$-thick tree $T$ of $G$ for $\alpha = \Omega(1/\log n)$.
We first observe that the While loop in Step 11 runs for at most $O(\log n)$ iterations. This
is because we delete $\Omega(n_i)$ leaves in each iteration and hence after $k = O(\log n)$ iterations,
we get $G_k = \emptyset$. We now count the number of 1-edges “lost” in each iteration. We recall
that $W_G(v)$ is the total number of 1-edges incident on $v$ in a graph $G$. In an iteration $i$,
the number of 1-edges lost at Step 12 is at most $\sum_{v \in L_i} W_{G_i}(v)$. In addition, we may lose
a total of at most $2n \le \frac{2\eta}{c \log n}$ edges due to the contraction of degree two vertices in Step
14. Suppose for the sake of contradiction that $\sum_{v \in L_i} W_{G_i}(v) < \frac{\eta}{d \log n}$ for all $1 \le i \le k$, where $d$ is
a suitable constant. Then the total number of 1-edges lost in $k = O(\log n)$ iterations is at
most

$$\sum_{i=1}^{k} \Big( \sum_{v \in L_i} W_{G_i}(v) \Big) + \frac{2\eta}{c \log n} < \sum_{i=1}^{k} \frac{\eta}{d \log n} + \frac{2\eta}{c \log n} = \frac{\eta}{d'} + \frac{2\eta}{c \log n} < \eta$$

The equality follows for a suitable constant $d'$ as $k = O(\log n)$. The final inequality holds for
a suitable choice of the constants $c$ and $d$. But this is a contradiction since we have $G_k = \emptyset$,
which means that all $\eta$ 1-edges must have been lost.

Since we choose $j$ to be the best iteration, we have $\sum_{v \in L_j} W_G(v) \ge \frac{\eta}{d \log n}$ for some
constant $d$. Hence the tree $T_j$ is an $\alpha$-thick tree of $G$ for $\alpha = \frac{1}{d \log n}$ and the theorem follows
by Lemma 1. □




General Weighted Graphs 

We now consider the Weighted Connected Maximum Cut (WCMC) problem. Formally, we
are given a graph $G = (V, E)$ and a weight function $w : E \to \mathbb{R}^+ \cup \{0\}$. The goal is to
find a subset $S$ of vertices that induces a connected subgraph and maximizes the quantity
$\sum_{e \in \delta(S)} w(e)$. We obtain an $\Omega(1/\log^2 n)$ approximation algorithm for this problem. Our basic
strategy is to group edges having nearly the same weight into a class and thus create $O(\log n)$
classes. We then solve the b-CMC problem for each class independently and return the best
solution.


Input: Connected graph $G = (V, E)$ with $|V| = n$ and $|E| = m$; weight function
    $w : E \to \mathbb{R}^+ \cup \{0\}$; $\epsilon > 0$;
Output: A subset $S \subseteq V$, such that $G[S]$ is connected;
Let $w_{max}$ be the maximum weight over any edge of the graph;
Define $w_0 = \ldots$ and $w_i = w_0 (1 + \epsilon)^i$ for $i \in [\log_{1+\epsilon} \ldots]$;
for $i \in [0, \log_{1+\epsilon} \ldots]$ do
    for $e \in E$ do
        if $w_i \le w(e) < w_{i+1}$ then
            $w'_i(e) = 1$;
        else
            $w'_i(e) = 0$;
        end
    end
    Using Theorem 1, solve for the connected subset $S_i$;
end
return $S_{best}$, such that $best = \arg\max_i \sum_{e \in \delta(S_i)} w(e)$;

Algorithm 2: Algorithm for the Weighted Connected Maximum Cut problem.
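The bucketing idea behind Algorithm 2 can be sketched in Python as follows. The smallest class boundary is not legible in the text above, so the sketch assumes $w_0 = \epsilon \cdot w_{max} / m$; this value, the function names, and the `solve_bcmc` callback (standing in for the b-CMC algorithm of Theorem 1) are all assumptions made for illustration. Edges are keyed by vertex pairs `(u, v)`.

```python
def weight_classes(weights, eps):
    """Split edges into geometric weight classes; return one 0/1 weight dict per class.

    ASSUMPTION: the lowest boundary is w0 = eps * w_max / m; edges lighter than w0
    get weight 0 in every class.
    """
    m = len(weights)
    w_max = max(weights.values(), default=0)
    if w_max <= 0:
        return []
    w0 = eps * w_max / m
    bounds = [w0]
    while bounds[-1] <= w_max:
        bounds.append(bounds[-1] * (1 + eps))
    return [
        {e: int(lo <= w < hi) for e, w in weights.items()}
        for lo, hi in zip(bounds, bounds[1:])
    ]


def best_over_classes(weights, eps, solve_bcmc):
    """Run a b-CMC solver on every class and keep the set with the best true cut weight."""
    best_S, best_val = None, -1
    for cls in weight_classes(weights, eps):
        S = set(solve_bcmc(cls))
        val = sum(w for (u, v), w in weights.items() if (u in S) != (v in S))
        if val > best_val:
            best_S, best_val = S, val
    return best_S, best_val
```

The boundaries are built by repeated multiplication instead of logarithms to avoid rounding problems at class borders. The final comparison uses the original weights $w$, matching the last step of Algorithm 2.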


Theorem 1. Algorithm 2 gives an $\Omega(1/\log^2 n)$ approximation guarantee for the Weighted
Connected Maximum Cut problem.


Proof. Let $OPT$ be an optimal solution for a given instance of the problem and let $\psi =
\sum_{e \in \delta(OPT)} w(e)$. Also, let $\epsilon \in (0, 1]$. Since we have that $\psi \ge w_{max}$, we can reset the
weights of those edges with weight $< w_0$ to 0 and assume that $w_{min} \ge w_0$, where
$w_{min}$ denotes the weight of the minimum (non zero) weight edge. Let $E_i$ be the set of
edges $e$ such that $w_i \le w(e) < w_{i+1}$ and finally let $OPT_i = \delta(OPT) \cap E_i$. We now
claim that $\sum_{e \in OPT_i} w(e) = O\big((1 + \epsilon) \log n \sum_{e \in \delta(S_i)} w(e)\big)$. This immediately gives us that
$\sum_{e \in \delta(S_{best})} w(e) = \Omega\big(\frac{1}{\log^2 n}\big) \sum_{e \in \delta(OPT)} w(e)$.

We now prove the claim. Consider solving the b-CMC instance with weight function
$w'_i$. Clearly $OPT$ is a feasible solution to this instance and we have $\sum_{e \in \delta(OPT)} w'_i(e) =
\sum_{e \in OPT_i} w'_i(e) \le O\big(\log n \sum_{e \in \delta(S_i)} w'_i(e)\big)$. The previous inequality holds as $S_i$ is guaranteed
to be an $\Omega(1/\log n)$-approximate solution by Theorem 1. Now, we have

$$\sum_{e \in OPT_i} w(e) \le (1 + \epsilon) w_i \sum_{e \in OPT_i} w'_i(e) \le O\Big((1 + \epsilon) w_i \log n \sum_{e \in \delta(S_i)} w'_i(e)\Big) \le O\Big((1 + \epsilon) \log n \sum_{e \in \delta(S_i)} w(e)\Big).$$

Hence, the claim. □




3 CMC in Planar and Bounded Genus Graphs 


In this section, we consider the CMC problem in planar graphs and more generally, in graphs 
with genus bounded by a constant. We show that the CMC problem has a PTAS in bounded 
genus graphs. 


PTAS for Bounded Genus Graphs. 

We use the following (paraphrased) contraction decomposition theorem by Demaine,
Hajiaghayi and Kawarabayashi [11].

Theorem 2. ([11]) For a bounded-genus graph $G$ and an integer $k$, the edges of $G$ can be
partitioned into $k$ color classes such that contracting all the edges in any color class leads
to a graph with treewidth $O(k)$. Further, the color classes are obtained by a radial coloring
and have the following property: If edge $e = (u, v)$ is in class $i$, then every edge $e'$ such that
$e' \cap e \ne \emptyset$ is in class $i - 1$ or $i$ or $i + 1$.

Given a graph G of constant genus, we use Theorem 2 appropriately to obtain a graph 
H with constant treewidth. In Appendix A, we show that one can solve the CMC problem 
optimally in polynomial time on graphs with constant treewidth. 


Theorem 3. If the CMC problem can be solved optimally on graphs of constant treewidth,
then there exists a polynomial time $(1 - \epsilon)$ approximation algorithm for the CMC problem
on bounded genus graphs (and hence on planar graphs).

Proof. Let $G = (V, E)$ be the graph of genus bounded by a constant and let $S$ denote the
optimal CMC of $G$ and $\psi = |\delta(S)|$ be its size. Using Theorem 2 with $k = \frac{3}{\epsilon}$, we obtain a
partition of the edges $E$ into $\frac{3}{\epsilon}$ color classes, namely $C_1, C_2, \ldots, C_{3/\epsilon}$. We further group three
consecutive color classes into $\frac{1}{\epsilon}$ groups $G_1, \ldots, G_{1/\epsilon}$ where $G_j = C_{3j-2} \cup C_{3j-1} \cup C_{3j}$. Let $G_{j^*}$
denote the group that intersects the least with the optimal connected max cut of $G$, i.e.,
$j^* = \arg\min_j (|G_j \cap \delta(S)|)$³. As the $\frac{1}{\epsilon}$ groups partition the edges, we have $|G_{j^*} \cap \delta(S)| \le \epsilon \psi$.
Let $i = 3j^* - 1$, so that $G_{j^*} = C_{i-1} \cup C_i \cup C_{i+1}$. Let $H = (V_H, E_H)$ denote the graph of
treewidth $O(1)$ obtained by contracting all edges of color $C_i$.

We first show that $H$ has a CMC of size at least $(1 - \epsilon)\psi$. For a vertex $v \in V_H$, let
$\mu(v) \subseteq V$ denote the set of vertices of $G$ that have merged together to form $v$ due to the
contraction. We define a subset $S' \subseteq V_H$ as $S' = \{v \in V_H \mid \mu(v) \cap S \ne \emptyset\}$. Note that
because we contract edges (and do not delete them), $S'$ remains connected. We claim that
$|\delta(S')| \ge (1 - \epsilon)\psi$. Let $e = (u, v)$ be an edge in $\delta(S)$. Now $e \notin \delta(S')$ implies that at least one
edge $e'$ such that $e' \cap e \ne \emptyset$ has been contracted. By the property guaranteed by Theorem 2,
we have that $e \in G_{j^*}$. Hence we have $|\delta(S')| \ge |\delta(S) \setminus G_{j^*}| = |\delta(S)| - |G_{j^*} \cap \delta(S)| \ge (1 - \epsilon)\psi$.

Finally, given a connected max cut of size $\psi$ in $H$, we can recover a connected max cut
of size at least $\psi$ in $G$ by simply un-contracting all the contracted edges. Hence, by solving
the CMC problem on $H$ optimally, we obtain a $(1 - \epsilon)$ approximate solution in $G$. □
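The counting behind the $(1 - \epsilon)$ bound is a pigeonhole argument over the groups. A short worked version is given below, assuming there are $1/\epsilon$ groups $G_1, \ldots, G_{1/\epsilon}$ that partition the edge set and writing $S$ for the optimal connected set in $G$.

```latex
\begin{align*}
\sum_{j=1}^{1/\epsilon} |G_j \cap \delta(S)| &= |\delta(S)| = \psi, \\
|G_{j^*} \cap \delta(S)| = \min_j |G_j \cap \delta(S)| &\le \epsilon \sum_{j=1}^{1/\epsilon} |G_j \cap \delta(S)| = \epsilon \psi, \\
|\delta(S')| \ge |\delta(S)| - |G_{j^*} \cap \delta(S)| &\ge (1 - \epsilon) \psi .
\end{align*}
```

The second line uses that the minimum of $1/\epsilon$ non-negative numbers is at most their average.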


³ We “guess” $j^*$ by trying out all the $\frac{1}{\epsilon}$ possibilities.



NP-hardness in planar graphs 

We now describe a non-trivial polynomial time reduction of a 3-SAT variant known as Planar 
Monotone 3-SAT (PM-3SAT) to the CMC problem on a planar graph, thereby proving that 
the latter is NP-hard. The following reduction is interesting as the classical Max-Cut problem 
can be solved optimally in polynomial time on planar graphs using duality. In fact, it was
earlier claimed that even CMC can be solved similarly [21].

An instance of PM-3SAT is a 3-CNF boolean formula $\phi$ such that:

a) A clause contains either all positive literals or all negative literals.

b) The associated bipartite graph⁴ is planar.

c) Furthermore, $\phi$ has a monotone, rectilinear representation. We refer the reader to Berg
and Khosravi [10] for a complete description. Figure 1a illustrates the rectilinear
representation by a simple example.



(a) Monotone rectilinear representation. (b) Reduction of PM-3SAT to a Planar CMC instance.

Fig. 1: Example illustrating the rectilinear representation and the reduction to a Planar CMC
instance of the formula $(x_1 \vee x_2 \vee x_5) \wedge (x_2 \vee x_3 \vee x_4) \wedge (\bar{x}_1 \vee \bar{x}_2 \vee \bar{x}_3) \wedge (\bar{x}_3 \vee \bar{x}_4 \vee \bar{x}_5) \wedge (\bar{x}_1 \vee \bar{x}_3 \vee \bar{x}_5)$.


Given such an instance, the PM-3SAT problem is to decide whether the boolean formula 
is satisfiable or not. Berg and Khosravi [10] show that the PM-3SAT problem is NP-complete. 

The Reduction. Given a PM-3SAT formula $\phi$, with a rectilinear representation, we obtain
a polynomial time reduction to a Planar CMC instance, thereby showing that the latter is
NP-hard. Let $\{x_i\}_{i=1}^{n}$ denote the variables of the PM-3SAT instance and $\{C_j\}_{j=1}^{m}$ denote
the clauses. We construct a planar graph $H_\phi$ as follows. For every variable $x_i$, we construct
the following gadget: We create two vertices $v(x_i)$ and $v(\bar{x}_i)$ corresponding to the literals
$x_i$ and $\bar{x}_i$. Additionally, we have $K > m^2$ "helper" vertices, $h^i_1, h^i_2, \ldots, h^i_K$, such that each
$h^i_k$ is adjacent to both $v(x_i)$ and $v(\bar{x}_i)$. Further, for every $h^i_k$ we add a set $L^i_k$ of $K$ new vertices
that are adjacent only to $h^i_k$. Now, in the rectilinear representation of the PM-3SAT, we
replace each variable rectangle by the above gadget. For two adjacent variable rectangles
in the rectilinear representation, say $x_i$ and $x_{i+1}$, we connect the helpers $h^i_K$ and $h^{i+1}_1$. For
every clause $C_j$, $H_\phi$ has a corresponding vertex $v(C_j)$ with edges to the three literals in
the clause. Finally, for each vertex $v(C_j)$, we add a set $L_j$ of $\sqrt{K}$ new vertices adjacent

^ The associated bipartite graph has a vertex for each clause and each variable, and an edge between a clause and each variable
that it contains.








only to $v(C_j)$. It is easy to observe that the reduction maintains the planarity of the graph.
Figure 1b illustrates the reduction by an example.
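The construction can be sketched in code as an edge list. This is a minimal illustration, not part of the proof: the function name `build_reduction_graph`, the vertex labels, and the signed-integer encoding of literals are all hypothetical. It also assumes that variables are numbered in the left-to-right order of their rectangles, that consecutive gadgets are joined through the last helper of one and the first helper of the next, and that $K$ is a perfect square so that $\sqrt{K}$ is an integer. Planarity of the embedding is not checked.

```python
from math import isqrt


def build_reduction_graph(n, clauses, K):
    """Return the edge list of the graph built from a PM-3SAT instance.

    n       -- number of variables x_1..x_n, in left-to-right order (assumed)
    clauses -- list of 3-literal clauses; literal +i is x_i, -i its negation
               (an illustrative encoding, not taken from the source)
    K       -- helpers per variable; assumed a perfect square with K > m^2
    """
    m = len(clauses)
    root = isqrt(K)
    assert root * root == K and K > m * m
    edges = []
    for i in range(1, n + 1):
        for k in range(1, K + 1):
            helper = ("h", i, k)
            # Each helper is adjacent to both literal vertices of x_i.
            edges.append((helper, ("lit", i)))
            edges.append((helper, ("lit", -i)))
            # Each helper carries K private degree-one vertices.
            for t in range(K):
                edges.append((helper, ("L", i, k, t)))
        if i < n:
            # Join adjacent variable gadgets through one helper each.
            edges.append((("h", i, K), ("h", i + 1, 1)))
    for j, clause in enumerate(clauses):
        # Monotone: a clause is all positive or all negative.
        assert all(l > 0 for l in clause) or all(l < 0 for l in clause)
        c = ("C", j)
        for lit in clause:
            edges.append((c, ("lit", lit)))
        # Each clause vertex carries sqrt(K) private degree-one vertices.
        for t in range(root):
            edges.append((c, ("Lc", j, t)))
    return edges


example_edges = build_reduction_graph(3, [[1, 2, 3]], 4)
```

With $n$ variables and $m$ clauses the edge count is $n(K^2 + 2K) + (n - 1) + m(\sqrt{K} + 3)$, which is polynomial in the size of the formula.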

We show the following theorem that proves the Planar Connected Maximum Cut problem 
is NP-hard. 

Theorem 4. Let $H_\phi$ denote an instance of the planar CMC problem corresponding to an
instance $\phi$ of PM-3SAT obtained as per the reduction above. Then, the formula $\phi$ is satisfiable
if and only if there is a solution $S$ to the CMC problem on $H_\phi$ with $|\delta_{H_\phi}(S)| \ge m\sqrt{K} + nK + nK^2$.

Proof. For brevity, we denote $\delta_{H_\phi}(S)$ as $\delta(S)$ in the rest of the proof.

Forward direction. Assume that $\phi$ is satisfiable under an assignment $A$. We now show
that we can construct a set $S$ with the required properties. Let $\{l_i\}_{i \in [n]}$ be the set of literals
that are true in $A$. We define $S = \{v(l_i)\}_{i \in [n]} \cup \{v(C_j)\}_{j \in [m]} \cup \{h^i_k\}_{i \in [n], k \in [K]}$, i.e., the set
of vertices corresponding to the true literals, all the clauses and all the helper vertices. By
construction, the set of all helper vertices and one literal of each variable induces a connected
subgraph. Further, since in a satisfying assignment every clause has at least one true literal,
the constructed set $S$ is connected. We now show that $|\delta(S)| \ge m\sqrt{K} + nK + nK^2$. Indeed,
$\delta(S)$ contains all the edges corresponding to the degree-one vertices incident on clauses and
all the helpers. This contributes a profit of $m\sqrt{K} + nK^2$. Also, since no vertex corresponding
to a false literal is included in $S$ but all helpers are in $S$, we get an additional profit of $K$
for each variable. Hence, we have the claim.
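The per-variable profit of $K^2 + K$ can be checked on a single gadget. The sketch below is illustrative only: it builds one variable gadget with $K = 4$ under hypothetical vertex names (`"x"`, `"not_x"`, and tuples for helpers and their degree-one vertices) and counts cut edges for three choices of $S$.

```python
def cut_size(edges, S):
    """Number of edges with exactly one endpoint in S."""
    return sum(1 for u, v in edges if (u in S) != (v in S))


K = 4
helpers = [("h", k) for k in range(K)]
edges = []
for h in helpers:
    edges.append((h, "x"))          # helper to the positive literal
    edges.append((h, "not_x"))      # helper to the negative literal
    edges.extend((h, ("leaf", h[1], t)) for t in range(K))  # K private leaves

S_one = set(helpers) | {"x"}         # all helpers and exactly one literal
S_both = S_one | {"not_x"}           # both literals chosen
S_missing = S_one - {helpers[0]}     # one helper left out

profit_one = cut_size(edges, S_one)          # K*K leaf edges + K literal edges
profit_both = cut_size(edges, S_both)        # only the K*K leaf edges
profit_missing = cut_size(edges, S_missing)  # loses that helper's K leaf edges
```

In this small instance, choosing both literals or dropping a helper each costs $K$ edges relative to `S_one`, which matches the counting used in both directions of the proof.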

Reverse direction. Assume that $S$ is a subset of vertices in $H_\phi$ such that $H_\phi[S]$ is
connected and $|\delta(S)| \ge m\sqrt{K} + nK + nK^2$. We now show that $\phi$ is satisfiable. We may assume
that $S$ is an optimal solution (since an optimal solution will satisfy these properties if a
sub-optimal solution does). We first observe that at least one of the (two) literals for each variable
must be chosen into $S$. Indeed, if this is not the case for some variable, for $H_\phi[S]$ to remain
connected, none of the helper vertices corresponding to that variable can be chosen. This
implies that the maximum possible value for $|\delta(S)|$ is at most $(n - 1)K^2 + m\sqrt{K} + 3m + 2(n - 1)K$
(this is the number of remaining edges) $< nK^2$ (since $K > m^2$) $< m\sqrt{K} + nK + nK^2$, a
contradiction. We now show that every helper vertex must be included in $S$. Assume that
this is not true and let $h^i_k$ be some helper vertex not added to $S$. We note that none of the
$K$ degree-one vertices in $L^i_k$ can be in $S$ because $H_\phi[S]$ must be connected. Now, consider
the solution $S'$ formed by adding $h^i_k$ to $S$. Since at least one of the vertices $v(x_i)$ or $v(\bar{x}_i)$ is in $S$,
if $H_\phi[S]$ is connected, so is $H_\phi[S']$. Further, the total number of edges in the cut increases
by $K - 2$. This is a contradiction to the fact that $S$ is an optimal solution. Hence, every
helper vertex $h^i_k$ belongs to the solution $S$. We now show that no two literals of the same
variable are chosen into $S$. Assume the contrary and let $v(x_i)$, $v(\bar{x}_i)$ both be chosen into $S$.
We claim that removing one of these two literals will strictly improve the solution. Indeed,
consider removing $v(x_i)$ from $S$. Clearly, we gain all the edges from $v(x_i)$ to all the helper
vertices corresponding to this variable. Thus we gain at least $K$ edges. We now bound the
loss incurred. In the worst case, removing $v(x_i)$ from $S$ might force the removal of all the
clause vertices due to the connectivity restriction. But this would lead to a loss of at most
$m\sqrt{K} + 3m < K$. Hence, we arrive at a contradiction to the fact that $S$ is an optimal solution. Therefore,
exactly one literal vertex corresponding to each variable is included in $S$. Finally, we observe
that all the clauses must be included in $S$. Assume this is not true and that $m' < m$ clause
vertices are in $S$. Now the total cut is $nK + nK^2 + m'\sqrt{K} < nK + nK^2 + m\sqrt{K}$, which is
again a contradiction. Now, the optimal solution $S$ gives a natural assignment to the
PM-3SAT instance: a literal is set to TRUE if its corresponding vertex is included in $S$. Since
every clause vertex belongs to $S$, which in turn is connected, it must contain a TRUE literal
and hence the assignment satisfies $\phi$. □
A Dynamic program for constant tree-width graphs

The notion of tree decomposition and tree-width was first introduced by Robertson and
Seymour [30]. Given a graph $G = (V, E)$, its tree decomposition is a tree representation
$T = (B, \mathcal{E})$, where each $b \in B$ (called a bag) is associated with a subset $B_b \subseteq V$ such that
the following properties hold:

1. $\bigcup_{b \in B} B_b = V$.

2. For every edge $(u, v) \in E$, there is a bag $b \in B$ such that $u, v \in B_b$.

3. For every $u \in V$, the subgraph $T_u$ of $T$, induced by bags that contain $u$, is connected.

The width of a decomposition is defined as the size of the largest bag $b \in B$ minus one.
Treewidth of a graph is the minimum width over all the possible tree decompositions. In
this section, we show that the CMC problem can be solved optimally in polynomial time on
graphs with constant treewidth $t$.
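The three properties translate directly into a checker. The sketch below is illustrative: the function names and the dictionary-based representation of bags are our own choices, and `tree_edges` is assumed (not verified) to describe a tree on the bag identifiers.

```python
def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """Check properties 1-3 for bags (dict: bag id -> set of vertices)."""
    # Property 1: the bags cover every vertex.
    covered = set()
    for B in bags.values():
        covered |= B
    if covered != set(vertices):
        return False
    # Property 2: every edge lies inside some bag.
    for u, v in edges:
        if not any(u in B and v in B for B in bags.values()):
            return False
    # Property 3: bags containing u induce a connected subgraph of the tree.
    for u in vertices:
        holding = {b for b, B in bags.items() if u in B}
        start = next(iter(holding))
        seen, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for p, q in tree_edges:
                for a, c in ((p, q), (q, p)):
                    if a == x and c in holding and c not in seen:
                        seen.add(c)
                        stack.append(c)
        if seen != holding:
            return False
    return True


def width(bags):
    """Size of the largest bag minus one."""
    return max(len(B) for B in bags.values()) - 1


path_vertices = ["a", "b", "c"]
path_edges = [("a", "b"), ("b", "c")]
good_bags = {1: {"a", "b"}, 2: {"b", "c"}}
bad_bags = {1: {"a", "b"}, 2: {"c"}, 3: {"b", "c"}}  # b's bags are split by bag 2

good = is_tree_decomposition(path_vertices, path_edges, good_bags, [(1, 2)])
bad = is_tree_decomposition(path_vertices, path_edges, bad_bags, [(1, 2), (2, 3)])
```

The second decomposition covers all vertices and edges but violates property 3, since the bags containing `b` are separated by a bag that does not contain it.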

Notation. We denote the tree decomposition of a graph $G = (V, E)$ by $T = (B, \mathcal{E})$. For a
given bag of the decomposition $b \in B$, let $B_b$ denote the set of vertices of $G$ contained in $b$
and $V_b$ denote the set of vertices in the subtree of $T$ rooted at $b$. As shown by Kloks [24], we
may assume that $T$ is a nice tree decomposition, which has the following additional properties.


1. Any node of the tree has at most 2 children.

2. A node $b$ with no children is called a leaf node and has $|B_b| = 1$.

3. A node $b$ with two children $c_1$ and $c_2$ is called a join node. For such a node, we have
$B_b = B_{c_1} = B_{c_2}$.

4. A node $b$ with exactly one child $c$ is either a forget node or an introduce node. If $b$ is a
forget node then $B_b = B_c \setminus \{v\}$ for some $v \in B_c$. On the other hand, if $b$ is an introduce
node then $B_b = B_c \cup \{v\}$ for some $v \notin B_c$.
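The four node types can be recognized from a bag and the bags of its children alone. The following sketch uses a hypothetical helper `node_type` that returns `None` when a node does not match any of the types of a nice tree decomposition.

```python
def node_type(bag, child_bags):
    """Classify a node given its bag and the list of its children's bags."""
    if len(child_bags) == 0:
        # Leaf: no children and a single vertex in the bag.
        return "leaf" if len(bag) == 1 else None
    if len(child_bags) == 2:
        # Join: both children carry exactly the same bag as the node.
        return "join" if bag == child_bags[0] == child_bags[1] else None
    if len(child_bags) == 1:
        child = child_bags[0]
        if bag < child and len(child - bag) == 1:
            return "forget"      # one vertex of the child bag is dropped
        if bag > child and len(bag - child) == 1:
            return "introduce"   # one new vertex is added to the child bag
    return None


kinds = [
    node_type({"a"}, []),
    node_type({"a", "b"}, [{"a"}]),
    node_type({"a"}, [{"a", "b"}]),
    node_type({"a", "b"}, [{"a", "b"}, {"a", "b"}]),
    node_type({"a", "b"}, [{"c"}]),
]
```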

We now describe a dynamic program to obtain the optimal solution for the CMC problem.
Let OPT denote the optimal solution. We first prove the following simple claim that
helps us define the dynamic program variable.


Claim. For any bag $b \in B$, the number of components induced by $OPT \cap V_b$ in $G$ is at most $t$.

Proof. Consider the induced subgraph $G[OPT \cap V_b]$ and let $C$ be one of its components. We
observe that $C$ has at least one vertex in $B_b$, i.e., $C \cap B_b \neq \emptyset$. Assume this is not true and
$C \cap B_b = \emptyset$. Now consider an edge $e = (u, v)$ such that $u \in C$ and $v \in OPT \setminus C$. Such an edge
is guaranteed to exist owing to the connectivity of $G[OPT]$. By our assumptions, $v \notin V_b$
(otherwise $v$ would belong to the component $C$).
This implies there is some bag $b'$ not in the subtree of $T$ rooted at $b$ that contains both $u$
and $v$. Since the bags containing $u$ induce a connected subgraph of $T$, this in turn implies $u \in B_b$, a contradiction to the assumption $C \cap B_b = \emptyset$. Now,
since each vertex in $B_b$ belongs to at most one component, there can be at most $|B_b| \le t$
components in $G[OPT \cap V_b]$. Hence, the claim.

For a given $b \in B$, let $S_b = OPT \cap B_b$ be the set of vertices chosen by the optimal solution
from the bag $b$. Further, let $P_b = (C_1, C_2, \ldots, C_t)$ be a partition, of size $t$, of the vertices in
$S_b$, such that each non-empty $C_i$ (some of the $C_i$'s could possibly be empty) is a subset of
a unique component of the subgraph induced by $V_b \cap OPT$. We now define the variable of
the dynamic program $M_b(P_b, S_b)$ in the following way: Consider the subgraph induced by
$V_b$ in $G$ and let $S$ be a subset of $V_b$ with maximum cut $\delta_{G[V_b]}(S)$, such that every $C_i \in P_b$
is completely contained in a distinct component of $G[S]$. We set $M_b(P_b, S_b) = |\delta_{G[V_b]}(S)|$.
From this definition, it follows that the optimal solution can be obtained by computing
$M_r(P_r = (S, \emptyset, \emptyset, \ldots, \emptyset), S)$ for every subset $S$ of $B_r$, where $r$ is the root bag of the tree $T$,
and picking the best possible solution. We now describe the dynamic program to compute
the above variable $M_b(P_b, S_b)$ for a given bag $b$.
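For very small graphs, the quantity that the dynamic program is meant to compute at the root can be cross-checked by exhaustive search. The sketch below is a brute-force reference, not the dynamic program itself; the function names and the adjacency-dictionary input are illustrative choices, and the running time is exponential in the number of vertices.

```python
from itertools import combinations


def is_connected(adj, S):
    """True if the subgraph induced by the non-empty set S is connected."""
    S = set(S)
    start = next(iter(S))
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y in S and y not in seen:
                seen.add(y)
                stack.append(y)
    return seen == S


def cut_size(adj, S):
    """Number of edges with exactly one endpoint in S."""
    S = set(S)
    return sum(1 for u in S for w in adj[u] if w not in S)


def brute_force(adj, require_connected=True):
    """Best cut over all non-empty subsets, optionally only connected ones."""
    best = 0
    V = list(adj)
    for r in range(1, len(V) + 1):
        for combo in combinations(V, r):
            if require_connected and not is_connected(adj, combo):
                continue
            best = max(best, cut_size(adj, combo))
    return best


# Path a-b-c-d: the unconstrained maximum cut {a, c} is not connected.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
cmc_value = brute_force(path)
max_cut_value = brute_force(path, require_connected=False)
```

On this path the connectivity requirement reduces the optimum from 3 to 2, which gives a small sanity check for any implementation of the table $M_b$.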

Case 1: Node $b \in T$ is a leaf node. In this case, $B_b = V_b = \{v\}$, for some vertex $v$.

$$M_b((\{v\}, \emptyset, \ldots, \emptyset), \{v\}) = 0$$
$$M_b((\emptyset, \emptyset, \ldots, \emptyset), \emptyset) = 0$$

Case 2: Node $b \in T$ is an introduce node. In this case, $b$ has exactly one child node $c$ and
$B_b = B_c \cup \{v\}$, for some vertex $v$. Let $P_b = (C_1, C_2, \ldots, C_t)$ be some partition of $S_b \subseteq B_b$.
We compute $M_b(P_b, S_b)$ as follows.

If $v \notin S_b$, then

$$M_b(P_b, S_b) = M_c(P_b, S_b) + |\delta_{G[B_b]}(v, S_b)|$$

If $v \in C_i$ and $v$ is adjacent to some vertex in $C_i \setminus \{v\}$ but not to any vertex in $C_j$, $j \neq i$,
then

$$M_b(P_b, S_b) = M_c((C_1, C_2, \ldots, C_i \setminus \{v\}, \ldots, C_t), S_b \setminus \{v\}) + |\delta_{G[B_b]}(v, B_b \setminus S_b)|$$

In all other cases, we set

$$M_b(P_b, S_b) = -\infty$$


We now argue about the correctness of this case. First assume that $v \notin S_b$, that is, $v$ is not
chosen into our solution. If $S \subseteq V_b$ is the set of vertices chosen into our solution so far, the
total cut size increases by $|\delta_{G[V_b]}(v, S)|$, which we claim is equal to $|\delta_{G[B_b]}(v, S_b)|$. In other
words, none of the edges incident on $v$ are adjacent to any vertices in $S \setminus S_b$. Suppose for
the sake of contradiction that there exists a vertex $w \in S \setminus S_b$ such that $\{v, w\} \in E$. This
implies that there exists a bag $b'$, not in the subtree of $b$, that contains both $v$ and $w$. This
in turn implies that $w \in S_b$, which is a contradiction to the fact that $w \in S \setminus S_b$. Hence, in
this case the total cut increases by $|\delta_{G[B_b]}(v, S_b)|$.

Now assume that $v \in S_b$; more specifically, let $v \in C_i$ for some $i$. From the above
argument there are no edges between $v$ and vertices in $S \setminus S_b$. Since we must have all
vertices in $C_i$ in a single component of $G[V_b]$ and all edges incident on $v$ in $G[V_b]$ have the
other end in $B_b$, $v$ must have an edge to some vertex in $C_i \setminus \{v\}$. Further, since any $C_i$ and
$C_j$ must belong to distinct components, they must not share any vertices. Thus, if $v$ either
has no edges to $C_i \setminus \{v\}$ or has an edge to some $C_j$, there is no feasible solution and we
assign $-\infty$ to this variable. On the other hand, if both these conditions are satisfied, our
solution is valid and the increase in the cut size is $|\delta_{G[V_b]}(v, B_b \setminus S_b)|$.

Case 3: Node $b \in T$ is a forget node. In this case, again, $b$ has exactly one child node $c$ with
$B_b = B_c \setminus \{v\}$. Let $P_b = (C_1, C_2, \ldots, C_t)$. It is easy to see that:

$$M_b(P_b, S_b) = \max \begin{cases} M_c(P_b, S_b) \\ M_c((C_1, C_2, \ldots, C_i \cup \{v\}, \ldots, C_t), S_b \cup \{v\}), & \forall i \in [t] \end{cases}$$


Case 4: Node $b \in T$ is a join node. In this case, $b$ has two children $c_1, c_2$ with $B_b =
B_{c_1} = B_{c_2}$. Let $P_b = (C_1, C_2, \ldots, C_t)$ be the partition of vertices in $S_b$ such that each $C_i$
belongs to a distinct component in $G[OPT \cap V_b]$. Similarly, define $P_{c_1} = (C^1_1, C^1_2, \ldots, C^1_t)$
and $P_{c_2} = (C^2_1, C^2_2, \ldots, C^2_t)$ as partitions of $S_{c_1} = S_b$ and $S_{c_2} = S_b$ respectively. Consider the
construction of the following auxiliary graph, which we refer to as the "merge graph", denoted by $\mathcal{M}$.
For every $C^1_i$ and $C^2_j$, we have a corresponding vertex $v(C^1_i)$ and $v(C^2_j)$ respectively in $\mathcal{M}$.
Further, if two sets $C^1_i$ and $C^2_j$ intersect, i.e., $C^1_i \cap C^2_j \neq \emptyset$, then we add an edge between
$v(C^1_i)$ and $v(C^2_j)$ in $\mathcal{M}$. It is easy to observe that for a given component $\mathcal{C}$ of $\mathcal{M}$, the union of all
subsets of vertices corresponding to the vertices of $\mathcal{C}$ must belong to the same component
of $G[OPT \cap V_b]$. This in turn implies that there is a one-to-one correspondence between
$C_i \in P_b$ and the components of $\mathcal{M}$.

For a given partition $P_b$ of $S_b$, we call two partitions $P_{c_1}$ and $P_{c_2}$ of $S_{c_1}$ and $S_{c_2}$
"valid" if there is a one-to-one correspondence between $C_i \in P_b$ and the components of $\mathcal{M}$,
as described above. We now prove the following simple claim.
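The validity test can be sketched by merging intersecting parts, which yields the unions of vertex sets over the components of the merge graph. The names `merged_blocks` and `is_valid` are hypothetical, partitions are given as lists of sets (empty parts allowed), and the one-to-one correspondence is read here as equality between the non-empty parts of $P_b$ and those unions.

```python
def merged_blocks(P1, P2):
    """Union of the parts in each component of the merge graph of P1 and P2."""
    blocks = []
    for part in list(P1) + list(P2):
        part = set(part)
        if not part:
            continue
        # Absorb every block built so far that intersects this part; these
        # correspond to merge-graph vertices joined by an edge or a path.
        overlapping = [B for B in blocks if B & part]
        for B in overlapping:
            part |= B
            blocks.remove(B)
        blocks.append(part)
    return {frozenset(B) for B in blocks}


def is_valid(Pb, P1, P2):
    """True if the non-empty parts of Pb match the merged components."""
    return merged_blocks(P1, P2) == {frozenset(C) for C in Pb if C}


P_c1 = [{"a", "b"}, {"c"}, {"d"}, set()]
P_c2 = [{"a"}, {"b", "c"}, {"d"}, set()]

valid = is_valid([{"a", "b", "c"}, {"d"}, set(), set()], P_c1, P_c2)
invalid = is_valid([{"a", "b"}, {"c", "d"}, set(), set()], P_c1, P_c2)
```

Here `{a, b}` from the first child and `{b, c}` from the second share `b`, so the two children together force `a`, `b`, `c` into one component at the join node.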


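Claim. For any $S \subseteq V_b$, $\delta_{G[V_b]}(S, V_b \setminus S) = \delta_{G[V_{c_1}]}(S_1, V_{c_1} \setminus S_1) + \delta_{G[V_{c_2}]}(S_2, V_{c_2} \setminus S_2) -
\delta_{G[V_b]}(S_b, B_b \setminus S_b)$, where $S_1 = S \cap V_{c_1}$ and $S_2 = S \cap V_{c_2}$.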


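Proof. From the properties of tree decomposition, it follows that for any two vertices $u \in
V_{c_1} \setminus B_b$ and $w \in V_{c_2} \setminus B_b$, $uw \notin E$. Further, all the edges in $\delta(V_{c_1} \setminus B_b)$ and $\delta(V_{c_2} \setminus B_b)$ are
incident on the vertices in $B_b$. Thus we have the following equation,

$$\begin{aligned}
\delta_{G[V_b]}(S, V_b \setminus S) &= \delta_{G[V_{c_1}]}(S_1 \setminus B_b, V_{c_1} \setminus S_1) + \delta_{G[V_{c_2}]}(S_2 \setminus B_b, V_{c_2} \setminus S_2) + \delta_{G[V_b]}(S_b, B_b \setminus S_b) \\
&= \left(\delta_{G[V_{c_1}]}(S_1 \setminus B_b, V_{c_1} \setminus S_1) + \delta_{G[V_b]}(S_b, B_b \setminus S_b)\right) \\
&\quad + \left(\delta_{G[V_{c_2}]}(S_2 \setminus B_b, V_{c_2} \setminus S_2) + \delta_{G[V_b]}(S_b, B_b \setminus S_b)\right) - \delta_{G[V_b]}(S_b, B_b \setminus S_b) \\
&= \delta_{G[V_{c_1}]}(S_1, V_{c_1} \setminus S_1) + \delta_{G[V_{c_2}]}(S_2, V_{c_2} \setminus S_2) - \delta_{G[V_b]}(S_b, B_b \setminus S_b)
\end{aligned}$$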
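The identity can be checked numerically on a toy join node by counting cut edges on both sides for every subset $S$. In this sketch the graph, the bag $B_b = \{a, b\}$, and the private vertices `p` and `q` of the two subtrees are made up for illustration; `cut(edges, within, S)` counts edges of the subgraph induced by `within` that have exactly one endpoint in `S`.

```python
from itertools import combinations


def cut(edges, within, S):
    """Cut edges of S in the subgraph induced by the vertex set `within`."""
    return sum(
        1
        for u, v in edges
        if u in within and v in within and (u in S) != (v in S)
    )


B_b = {"a", "b"}
V_c1 = B_b | {"p"}   # vertices below the first child
V_c2 = B_b | {"q"}   # vertices below the second child
V_b = V_c1 | V_c2
# No edge joins p and q, as required by the tree decomposition.
edges = [("a", "b"), ("a", "p"), ("b", "p"), ("a", "q"), ("b", "q")]


def identity_holds(S):
    S = set(S)
    lhs = cut(edges, V_b, S)
    rhs = (
        cut(edges, V_c1, S & V_c1)
        + cut(edges, V_c2, S & V_c2)
        - cut(edges, B_b, S & B_b)   # bag edges are counted in both children
    )
    return lhs == rhs


all_hold = all(
    identity_holds(combo)
    for r in range(len(V_b) + 1)
    for combo in combinations(sorted(V_b), r)
)
```

The subtraction removes the double count of cut edges that lie entirely inside the bag, since those edges appear in both child subgraphs.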

We can now compute the dynamic program variable, in this case, as follows.

$$M_b(P_b, S_b) = \max_{\substack{(P_{c_1}, P_{c_2}) \text{ valid} \\ \text{with respect to } P_b}} M_{c_1}(P_{c_1}, S_b) + M_{c_2}(P_{c_2}, S_b) - |\delta_{G[V_b]}(S_b, B_b \setminus S_b)|$$


B Maximum Leaf Spanning Tree 

In the Maximum Leaf Spanning Tree problem, we are given an undirected graph $G = (V, E)$
and the goal is to find a spanning tree that has a maximum number of leaves. We now show
that this well-studied problem is a special case of the Connected Submodular Maximization
problem. Consider maximizing the submodular function $f(S) = |\{v \mid v \in N(S) \setminus S\}|$, where
$N(S)$ is the set of vertices that have a neighbor in $S$. Now, it is easy to observe that
there is a tree with $L$ leaves if and only if there is a connected set $S$ such that $f(S) = L$.
Further, we show that without loss of generality, for any solution $S$ to the Connected Submodular
Maximization problem, we have that $S \cup N(S) = V$ and hence the corresponding tree is
spanning. Suppose $S$ is a feasible solution and $V \neq S \cup N(S)$; then there must exist an edge
$(u, v)$ such that $u \in N(S) \setminus S$ and $v \notin S \cup N(S)$. Now $S' = S \cup \{u\}$ is also a feasible solution
and $f(S') \ge f(S)$.
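One direction of the correspondence can be sketched in code: given a connected set $S$ with $S \cup N(S) = V$, build a spanning tree in which every vertex of $N(S) \setminus S$ is a leaf. The function names and the adjacency-dictionary representation are illustrative choices, and the example graph is made up.

```python
def f(adj, S):
    """Number of vertices outside S that have a neighbor in S."""
    S = set(S)
    return len({w for v in S for w in adj[v]} - S)


def tree_from_connected_set(adj, S):
    """Spanning tree whose internal vertices all lie in the connected set S."""
    S = set(S)
    root = min(S)
    tree, seen, queue = [], {root}, [root]
    while queue:                       # BFS tree of the induced subgraph G[S]
        x = queue.pop(0)
        for y in adj[x]:
            if y in S and y not in seen:
                seen.add(y)
                tree.append((x, y))
                queue.append(y)
    if seen != S:
        raise ValueError("S does not induce a connected subgraph")
    for w in set(adj) - S:             # hang every other vertex off S as a leaf
        parent = next((v for v in adj[w] if v in S), None)
        if parent is None:
            raise ValueError("S together with N(S) does not cover V")
        tree.append((parent, w))
    return tree


def count_leaves(tree):
    degree = {}
    for u, v in tree:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sum(1 for d in degree.values() if d == 1)


# Path 1-2-3 with one pendant vertex attached to each path vertex.
adj = {1: [2, 4], 2: [1, 3, 5], 3: [2, 6], 4: [1], 5: [2], 6: [3]}
S = {1, 2, 3}
tree = tree_from_connected_set(adj, S)
```

The resulting tree has at least $f(S)$ leaves, since every vertex outside $S$ is attached by a single edge.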

C High Connectivity 

We observe that the Connected Submodular Maximization has a constant factor approximation
algorithm if the graph has high connectivity. Let $S \subseteq V$ be a random set of vertices
such that every vertex $v$ is chosen in $S$ independently with probability $\frac{1}{2}$. As shown by Feige
et al. [15], $E[f(S)] \ge \frac{1}{4} f(OPT)$, where $f(OPT)$ is the maximum value of $f$. Further,
as shown by Censor-Hillel et al. [5], if the graph $G$ has $\Omega(\log n)$ vertex connectivity, the set
$S$ obtained above is connected with high probability. Hence, $S$ is a $\frac{1}{4}$ approximate solution
to the Connected Submodular Maximization problem.
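As a small sanity check of the sampling bound, the expectation can be computed exactly on a tiny, highly connected graph. The sketch below assumes $f$ is the cut function and uses the complete graph on 6 vertices, where every non-empty vertex set is connected; both choices are ours, made only for illustration. Averaging over all subsets is the same as sampling each vertex independently with probability $\frac{1}{2}$.

```python
from itertools import combinations

n = 6
V = range(n)
edges = list(combinations(V, 2))   # complete graph on n vertices


def cut(S):
    """Cut function: edges with exactly one endpoint in S."""
    return sum(1 for u, v in edges if (u in S) != (v in S))


# Every subset is equally likely when each vertex is kept with probability 1/2.
subsets = [set(c) for r in range(n + 1) for c in combinations(V, r)]
expected_value = sum(cut(S) for S in subsets) / len(subsets)
opt_value = max(cut(S) for S in subsets)
```

Each edge is cut with probability $\frac{1}{2}$, so the expectation is half the number of edges, comfortably above a quarter of the optimum in this instance.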



